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We propose a novel method for refining force-field parameters of protein systems. In this method, the 
agreement of the secondary-structure stability and instability between the protein conformations obtained by 
experiments and those obtained by molecular dynamics simulations is used as a criterion for the optimization 
of force-field parameters. As an example of the applications of the present method, we refined the force- 
field parameter set of the AMBER ff99SB force field by searching the torsion-energy parameter spaces of ip 
(N-C"-C-N) and ( (C^-C"-C-N) of the backbone dihedral angles. We then performed folding simulations of 
a-helical and /3-hairpin peptides, using the optimized force field. The results showed that the new force-field 
parameters gave structures more consistent with the experimental implications than the original AMBER 
ff99SB force field. 



I. INTRODUCTION 

In the field of molecular simulations of protein systems, 
force fields are widely used. The force field is represented 
by a sum of potential-energy terms, which are given by 
functions and their parameters based on classical me- 
chanics. There are several well-known force fields, such as 
AMBERl-^, CHARMM^ii, OPLS^, GROMOSi^l, and 
ECEPBii. When many researchers perform molecular 
simulations of protein systems, these force fields are uti- 
lized extensively. Generally, the force-field parameters 
are determined based on experimental results for small 
molecules and theoretical results using quantum chem- 
istry calculations of small peptides such as alanine dipep- 
tidc. 

In a force field, the potential energy is usually com- 
posed of the bond-stretching term, bond-bending term, 
torsion-energy term, and nonbondcd energy term. In 
these energy terms, it is known that the torsion- 
energy term is the most problematic. For instance, 
the fr94i and ff96^ versions of AMBER differ only in 
the backbone-torsion-energy parameters. Nevertheless, 
these secondary-structure-forming tendencies are quite 
differenti^rJ^. Therefore, many researchers have studied 
this energy term and its force-field parameters. Recently, 
new force-field parameters of the backbone torsion- 
energy term about (j) and tJj angles have been devel- 
oped, which are, e.g., AMBER ff99SBi, AMBER ff03&, 
CHARMM 22/CMABl and OPLS-AA/L^. Newly pro- 
posed methods of the force-field refinement also mainly 
concentrate on the torsion-energy terms. These mod- 
ifications of the torsion energy are usually based on 
quantum chemistry calculationsii"— . We have also pro- 
posed a force-field refinement method in the previous 



worksi^"— 1^"— . These methods are based on the Protein 
Data Bank (PDB) and minimize some score functions in 
the force-field parameter space. One of the methods con- 
sists of minimizing the sum of the square of the force 
acting on each atom in the proteins with the structures 
from the PDBiiri£iSii2^. Other approaches use the root- 
mean-square deviation between the original PDB struc- 
tures and the corresponding minimized structures as the 
score functions2^i2^. 

In this article, we introduce a novel optimization 
method for force-field parameters, which searches the 
lowest value of a score function defined by the sum of 
numbers of amino acids, which have secondary-structure 
conformations. As an example, we applied this approach 
to AMBER ff99SB force field and determined new force- 
field parameters for V (N-C"-C-N) and C (C^-C^-C-N) 
angles. Here, we are using the Greek letter C for the 
dihedral angle defined by C'-C^-C-N. 

In Methods section the details of the methodology are 
given. In Results and Discussion section the results of 
applications of the optimization method to the AMBER 
ff99SB force field arc presented. The last section is de- 
voted to conclusions. 



II. METHODS 

A. Force-field parameters 

The existing force fields for protein systems such as 
AMBERi-^, CHARMM^i, and OPLS^ use essentially 
the same functional forms for the potential energy i^conf 
except for minor differences. The conformational poten- 
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tial energy i^conf can be written as, 

ii'cont = Ebl + Eba + Et orsion H~ ^nonbond 7 (1) 

where, Ebl, Eba, £^torsion, and E'nonbond represent 
the bond-stretching term, the bond-bending term, the 
torsion-energy term, and the nonbonded energy term, re- 
spectively. Each force field has similar but slightly differ- 
ent parameter values. For example, the torsion energy is 
usually given by 

dihedral angle $ n 

where the first summation is taken over all dihedral an- 
gles $ (both in the backbone and in the side chains), n 
is the number of waves, 7„ is the phase, and Vn is the 
Fourier coefficient. Namely, the energy term -Etorsion has 
n, 7„, and Vn as force- field parameters. 

B. Optimization method of force-field parameters 

We now describe our new method for optimizing the 
force-field parameters. In this method, we prepare M 
protein structures, which are some experimentally deter- 
mined conformations. For these proteins, we perform 
MD simulations, which start from the experimental con- 
formations, by using a trial force field. We try to perform 
MD simulations with varied values of force-field parame- 
ters. After that, we estimate the "5" value defined by the 
following function from the trajectories of the M proteins 
obtained from the trial MD simulations: 

Here, nf~^^ is the number of the amino acids in protein i 
where their structures in PDB (initial conformation) had 
some secondary structures (such as a-helix, 3io-helix, tt- 
helix, and /3 structures) but transformed into unstruc- 
tured, coil structures without any secondary structures 
after a short MD simulation. Likewise, nY^^ is is the 
number of amino acids in protein i where their structures 
in PDB had coil structures but transformed to have some 
secondary structures after a MD simulation. Nf is the 
total number of amino acids in protein i which have some 
secondary structures in PDB, and Nj^ is the total num- 
ber of amino acids in protein i which have coil structures 
in PDB. 

When we calculate the S values for the conformations 
obtained from MD simulations by using trial force-field 
parameters, the parameter set, which yields the mini- 
mum S value, is considered to give the optimized force 
field. We remark that we want the number of proteins M 
and the total number of time steps of each MD simulation 
to be as large as possible in order to reduce the effects 
of experimental and systematic errors. In practice, the 
available computer power decides them. 



III. RESULTS AND DISCUSSION 

A. Applications of the optimization method 

We present the results of the applications of our op- 
timization method presented in the previous section to 
the AMBER ff99SB force field. We first chose 31 PDB 
files (M ~ 31) with resolution 2.0 A or better, with se- 
quence similarity of amino acid 30.0 % or lower, and with 
from 40 to 111 residues (the average number of residues 
is 86.7) from PDB-REPRDB^. Namely, the PDB IDs of 
these 31 proteins are ILDD, IHBK, 1Y02, 1I2T, 1U84, 
2ERL, ITQG, 1082, 1V54, IXAK, IGMU, 105U, INLQ, 
IWHO, ICQY, 1H75, IGMX, IIIB, IVCl, 1AY7, IKAF, 
IKPF, 1BM8, IMKO, 1EW4, lOSD, IVCC, lOPD, 
ICYO, ICTF, and 1N9L (these structures are shown 
in Fig. [T]). We added hydrogen atoms to the PDB 
coordinates by using the AMBERll program package. 
We then performed short potential energy minimizations 
while restraining the coordinates of the heavy atoms. We 
used the obtained conformations as the initial structures. 
These initial conformations were used as reference exper- 
imental structures in the definition of S in Eq. ([3]). For 
short MD simulations, the unit time step was set to 2.0 
fs and the bonds involving hydrogen were constrained by 
the SHAKE algorithm^!. Each simulation was carried 
out for 40.0 ps (hence, it consisted of 20,000 MD steps) 
by using Langevin dynamics at 300 K. The nonbonded 
cutoff of 20 A were used. As for solvent effects, we used 
the GB/SA model2^ included in the AMBER program 
package (igb = 5). For all the calculations, we used the 
AMBERll program paekage^^. 

As the target force-field parameters, we used the pa- 
rameters Vi of ijj (N-C"-C-N) and C (C^-C^-C-N) angles 
for the torsion-energy term in Eq. ^ . We performed the 
simulations by using 14 and 15 different values of the Vi 
parameters of -0 and ^, respectively, and each simulation 
with a fixed set of parameter values was repeated five 
times by changing the initial velocities of atoms in the 
31 proteins. Namely, we calculated «|^^ and n^'^^ in 
Eq. ^ as the average numbers of rq^^ and n^^^ of 
10 conformations from the last 20.0 ps (within 40.0 ps) 
for each simulation (the total number of conformations 
was thus 50 for each protein). The results are shown in 
Fig. [21 We determined the optimized force-field parame- 
ters in order of C and i/', by searching the minimum value 
of iS* in Fig. [2] The optimized Vi value for ^ is —1.60 
(while the original value is 0.20), and the optimized Vi 
value for ?/> is 0.31 (while the original value is 0.45). 

B. Test simulations of two peptides 

In order to test the validity of the force-field parame- 
ters obtained by our optimization method, we performed 
the folding simulations using two peptides, namely, C- 
peptide of ribonuclease A and the C-terminal fragment of 
the Bl domain of streptococcal protein G, which is some- 
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times referred to as G-peptide^. The C-peptide has 13 
residues and its amino-acid sequence is Lys-Glu~-Thr- 
Ala-Ala-Ala-Lys+-Phe-Glu-Arg+-Gln-His+-Met. This 
peptide has been extensively studied by experiments and 
is known to form an a-hehx structure^i^. Because 
the charges at peptide termini are known to affect he- 
hx stabihty^i^, the N and C termini of the peptide was 
blocked with acetyl and N-methyl groups, respectively. 
The G-pcptide has 16 residues and its amino-acid se- 
quence is Gly-Glu"-Trp-Thr-Tyr-Asp~-Asp"-Ala-Thr- 
Lys+-Thr-Phe-Thr-Val-Thr-Glu-. The termini were 
kept as the usual zwitter ionic states, following the ex- 
perimental conditions^i^i^. This peptide is known to 
form a /3-hairpin structure by experiments^i^>^. 

For test simulations, we used replica-exchange molec- 
ular dynamics (REMD)^. The unit time step was set to 
2.0 fs, and the bonds involving hydrogen were constrained 
by the SHAKE algorithnt^l. Each simulation was carried 
out for 30.0 ns (hence, it consisted of 15,000,000 MD 
steps) with 32 replicas by using Langevin dynamics. The 
replica exchange was tried every 3,000 steps. The tem- 
perature was distributed exponentially: 600, 585, 571, 
557, 544, 530, 517, 505, 492, 480, 469, 457, 446, 435, 
425, 414, 404, 394, 385, 375, 366, 357, 348, 340, 332, 324, 
316, 308, 300, 293, 286, and 279 K. As for solvent effects, 
we used the GB/SA model2^ included in the AMBER 
program package {igb = 5). These simulations were per- 
formed with different sets of randomly generated initial 
velocities. 

In Fig. |3j a helicity and strandness of two peptides 
obtained from the REMD simulations are shown. For 
the original AMBER ff99SB force field, the a hehcity is 
clearly larger than the strandness in not only C-peptide 
but also G-peptide. Namely, the original AMBER ff99SB 
force field clearly favors a-helix structure and does not 
favor /3 structure. On the other hand, for the optimized 
force field, in the case of C-peptide, the a helicity is larger 
than the strandness, and in the case of G-peptide, the 
strandness is larger than the a helicity. We can see that 
these results obtained from the optimized force field are 
in better agreement with the experimental results than 
the original force field. 

In Fig. m 3io helicity and tt helicity of two peptides 
obtained from the test simulations are shown. For 3io 
helicity, the optimized force field does not favor it, while 
the original force field favors it to some extent, tt helicity 
has almost no value in the both cases of the original and 
optimized force fields for the two peptides. As the both 
peptides have no 3io-hclix and 7r-helix in the experimen- 
tal results, these results of the optimized force field also 
have better secondary-structure-forming tendencies than 
those of the original force field. 

In Fig. [5l the 32 lowest-energy conformations of C- 
peptide obtained from the REMD simulations for each 
replica in the case of the original and optimized force 
fields are shown. In the case of the original force field, 
27 conformations have helix structures; 15 of them are 
a-helix and 12 are 3io-helix. In the case of the optimized 



force field, on the other hand, 14 conformations have he- 
lix structures, and all of them are a-helix. Moreover, 
five conformations are /3 structures. Both force fields fa- 
vor helix structures more than /? structures. The original 
force field also has some tendency to favor 3io-helix struc- 
tures, while the optimized force field does not. This result 
is consistent with the results of 3io helicity in Fig.H 

In Fig. m the 32 lowest-energy conformations of G- 
peptide obtained from the REMD simulations for each 
replica arc shown. As in the case of C-peptide, 27 confor- 
mations obtained from the simulations with the original 
force field have helix structures; 11 of them are a-helix, 
15 are 3io-helix, and one is 7r-helix. In the case of the op- 
timized force field, 11 conformations have helix structure 
(nine are a-helix and two are 3io-helix), and 16 confor- 
mations have /3 structures. We see that the optimized 
force field slightly favors /S structure in agreement with 
the experimental implications. 



IV. CONCLUSIONS 

In this work, we proposed a new optimization method 
of force-field parameters and, as an example, optimized 
some torsion-energy parameters of the ff99SB force field. 
This method can optimize force-field parameters using 
PDB structures. We applied these optimization method 
using 31 protein molecules from the Protein Data Bank. 
We then performed folding simulations of a-helical and 
/3-hairpin peptides. We found that the 3io helicity of the 
optimized force field decreases for both peptides and that 
the strandness of the optimized force field increases for 
/3-hairpin peptide in comparison with those of the orig- 
inal ff99SB. The results therefore showed that the opti- 
mized force-field parameters gave structures more con- 
sistent with the experimental implications than the orig- 
inal ff99SB force field. Moreover, the results also showed 
that the secondary-structurc-forming tendencies depend 
strongly on only two parameters of backbonc-torsion- 
energy term. We are now testing the validity of our 
method by the folding simulations of larger proteins. 
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FIG. 1. Structures of 31 proteins from PDB, which were used 
in the optimization of force-field parameters. The figures were 
created with VMD^. 
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FIG. 2. S values (defined in Eq. (|3])) obtained from MD simu- 
lations of 31 proteins with the force fields which have different 
Vi parameter values for C (C^-C^-C-N) (a) and iP (N-C"-C- 
N) (b) angles. 
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FIG. 3. a helicity (a-1) and strandness (a-2) of C-peptide and 
a helicity (b-1) and strandness (b-2) of G-peptide as functions 
of the residue number. Tliese values were obtained from the 
REMD simulations at 300 K. Dotted and normal curves stand 
for the original AMBER ff99SB and the optimized force field, 
respectively. 
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FIG. 4. 3io helicity (a-1) and tt helicity (a-2) of C-peptide and 
3io helicity (b-1) and tt helicity (b-2) of G-peptide as functions 
of the residue number. These values were obtained from the 
REMD simulations at 300 K. Dotted and normal curves stand 
for the original AMBER ff99SB and the optimized force field, 
respectively. 




FIG. 5. Lowest-energy conformations of C-peptide obtained 
for each replica from the REMD simulations, (a) and (b) are 
the results of the original AMBER ff99SB and the optimized 
force field, respectively. The conformations are ordered in the 
increasing order of energy. The figures were created with DS 
Visualizep21. 
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FIG. 6. Lowest-energy conformations of G-peptide obtained 
for each replica from the REMD simulations, (a) and (b) are 
the results of the original AMBER ff99SB and the optimized 
force field, respectively. The conformations are ordered in the 
increasing order of energy. The figures were created with DS 
Visualizep21. 



